DescTools (version 0.99.43)

Measures of Shape: Skewness and Kurtosis

Description

Skew computes the skewness, Kurt the excess kurtosis of the values in x.

Usage

Skew(x, weights = NULL, na.rm = FALSE, method = 3, conf.level = NA, 
     ci.type = "bca", R = 1000, ...)

Kurt(x, weights = NULL, na.rm = FALSE, method = 3, conf.level = NA, 
     ci.type = "bca", R = 1000, ...)

Arguments

x

a numeric vector. An object which is not a vector is coerced (if possible) by as.vector.

weights

a numerical vector of weights the same length as x giving the weights to use for elements of x.

na.rm

logical, indicating whether NA values should be stripped before the computation proceeds. Defaults to FALSE.

method

an integer, one of 1, 2 or 3 (the default). See Details.

conf.level

confidence level of the interval. If set to NA (which is the default) no confidence interval will be calculated.

ci.type

The type of confidence interval required. The value should be any subset of the values "classic", "norm", "basic", "stud", "perc" or "bca" ("all", which would compute all five types of bootstrap intervals, is not supported).

R

The number of bootstrap replicates. Usually this will be a single positive integer. For importance resampling, some resamples may use one set of weights and others use a different set of weights. In this case R would be a vector of integers where each component gives the number of resamples from each of the rows of weights.

...

the dots are passed to the function boot when confidence intervals are calculated.

Value

If conf.level is set to NA then the result will be a single numeric value, and if a conf.level is provided, a named numeric vector with 3 elements:

skew, kurt

the specific estimate, either skewness or kurtosis

lwr.ci

lower bound of the confidence interval

upr.ci

upper bound of the confidence interval
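
The two return shapes described above can be seen side by side in a short sketch. It assumes the DescTools package is installed (the block does nothing otherwise), uses a simulated vector x that is not part of the original examples, and picks ci.type = "classic" only to avoid bootstrap resampling in the illustration.

```r
# Illustration only: x is simulated data, not taken from the package.
if (requireNamespace("DescTools", quietly = TRUE)) {
  set.seed(1)
  x <- rexp(200)  # right-skewed sample

  # conf.level = NA (the default): a single numeric value
  point <- DescTools::Skew(x)

  # conf.level given: estimate plus lower and upper confidence bounds
  with_ci <- DescTools::Skew(x, conf.level = 0.95, ci.type = "classic")

  print(point)
  print(with_ci)
}
```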

Details

Kurt() returns the excess kurtosis. If the plain (non-excess) kurtosis is required, it can be calculated as Kurt(x) + 3.

If na.rm is TRUE then missing values are removed before computation proceeds.

The methods for calculating the skewness can be either:

method = 1: g_1 = m_3 / m_2^(3/2)
method = 2: G_1 = g_1 * sqrt(n(n-1)) / (n-2)
method = 3: b_1 = m_3 / s^3 = g_1 ((n-1)/n)^(3/2)

and the ones for the kurtosis:

method = 1: g_2 = m_4 / m_2^2 - 3
method = 2: G_2 = ((n+1) g_2 + 6) * (n-1) / ((n-2)(n-3))
method = 3: b_2 = m_4 / s^4 - 3 = (g_2 + 3) (1 - 1/n)^2 - 3

Here n is the sample size, m_r = sum((x - mean(x))^r) / n is the r-th sample central moment, and s is the sample standard deviation.

method = 1 is the typical definition used in many older textbooks. method = 2 is used in SAS and SPSS. method = 3 is used in MINITAB and BMDP.
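
The three definitions can be transcribed directly into base R, which may help when comparing results against other software. The following is a minimal sketch, not the package's implementation: skew_by_method and kurt_by_method are hypothetical helper names, and weights and NA handling are left out.

```r
# Hypothetical helpers (not part of DescTools): plain base-R transcription
# of the method = 1, 2, 3 formulas, without weights or NA handling.
skew_by_method <- function(x, method = 3) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- sum(d^2) / n          # second central moment
  m3 <- sum(d^3) / n          # third central moment
  g1 <- m3 / m2^(3/2)
  switch(as.character(method),
         "1" = g1,
         "2" = g1 * sqrt(n * (n - 1)) / (n - 2),
         "3" = g1 * ((n - 1) / n)^(3/2),
         stop("method must be 1, 2 or 3"))
}

kurt_by_method <- function(x, method = 3) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- sum(d^2) / n          # second central moment
  m4 <- sum(d^4) / n          # fourth central moment
  g2 <- m4 / m2^2 - 3         # excess kurtosis, method 1
  switch(as.character(method),
         "1" = g2,
         "2" = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3)),
         "3" = (g2 + 3) * (1 - 1 / n)^2 - 3,
         stop("method must be 1, 2 or 3"))
}

x <- c(2, 3, 5, 7, 11, 13, 17, 19, 23, 29)
sapply(1:3, function(m) skew_by_method(x, m))  # g_1, G_1, b_1
sapply(1:3, function(m) kurt_by_method(x, m))  # g_2, G_2, b_2
```

For method = 3 the result agrees with mean((x - mean(x))^3) / sd(x)^3, which is the m_3 / s^3 form of the definition.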

Cramer (1997) mentions the asymptotic standard errors of the skewness and the kurtosis:

ASE.skew = sqrt( 6n(n-1)/((n-2)(n+1)(n+3)) )
ASE.kurt = 2 * ASE.skew * sqrt( (n^2 - 1)/((n-3)(n+5)) )

to be used for calculating the confidence intervals. This is implemented here with ci.type="classic". However, Joanes and Gill (1998) advise against this approach, pointing out that the normal assumptions would virtually always be violated. They suggest using the bootstrap method. That's why the default method for the confidence interval type is set to "bca".
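
To make the contrast between the two interval types concrete, the sketch below builds a normal-approximation interval from ASE.skew and, if the boot package is available, a BCa bootstrap interval for the same statistic. It is an illustration under assumptions rather than the package's code: skew3 is a hypothetical helper implementing the method = 3 definition, the data are simulated, and whether DescTools pairs the ASE with exactly this estimator is not confirmed here.

```r
set.seed(42)
x <- rexp(100)                      # simulated, clearly non-normal sample

# Hypothetical helper: method = 3 skewness, m_3 / s^3
skew3 <- function(x) {
  d <- x - mean(x)
  mean(d^3) / sd(x)^3
}

n   <- length(x)
est <- skew3(x)

# "classic" style interval: estimate +/- z * asymptotic standard error
ase <- sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
z   <- qnorm(1 - (1 - 0.95) / 2)
classic <- c(skew = est, lwr.ci = est - z * ase, upr.ci = est + z * ase)
print(classic)

# BCa bootstrap interval via the boot package (skipped if boot is missing)
if (requireNamespace("boot", quietly = TRUE)) {
  b  <- boot::boot(x, statistic = function(data, i) skew3(data[i]), R = 1000)
  ci <- boot::boot.ci(b, conf = 0.95, type = "bca")
  # ci$bca holds: level, two order-statistic indices, lower and upper bound
  bca <- c(skew = est, lwr.ci = ci$bca[4], upr.ci = ci$bca[5])
  print(bca)
}
```

The classic interval is symmetric around the estimate by construction, while the BCa interval need not be, which is one reason it tends to suit skewed data better.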

This implementation of the two functions is comparatively fast, as the expensive sums are coded in C.

References

Cramer, D. (1997): Basic Statistics for Social Research. Routledge.

Joanes, D. N., Gill, C. A. (1998): Comparing measures of sample skewness and kurtosis. The Statistician, 47, 183-189.

See Also

mean, sd, similar code in library(e1071)

Examples

library(DescTools)  # provides Skew(), Kurt() and the d.pizza data set

Skew(d.pizza$price, na.rm=TRUE)
Kurt(d.pizza$price, na.rm=TRUE)

# use sapply to calculate skewness for a data.frame
sapply(d.pizza[,c("temperature","price","delivery_min")], Skew, na.rm=TRUE)

# or apply to do that columnwise with a matrix
apply(as.matrix(d.pizza[,c("temperature","price","delivery_min")]), 2, Skew, na.rm=TRUE)